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Abstract. We describe an improved version of the Hough transform search for continuous 
gravitational waves from isolated neutron stars assuming the input to be short segments 
of Fourier transformed data. The method presented here takes into account possible non- 
stationarities of the detector noise and the amplitude modulation due to the motion of the 
detector. These two effects are taken into account for the first stage only, i.e. the peak selection, 
to create the time-frequency map of our data, while the Hough transform itself is performed in 
the standard way. 



1. Introduction 

The Hough transform is a pattern recognition algorithm which was originally invented to analyze 
bubble chamber pictures from CERN pQ. It was later patented by IBM |2|, and it has found many 
applications in the analysis of digital images A detailed discussion of the Hough transform 
as applied to the search for continuous gravitational waves can be found in [1]. An example of 
such a search is in which this semi-coherent technique is used to perform an all-sky search 
for isolated spinning neutron stars using two months of data collected in early 2003 from the 
second science run of the LIGO detectors The main results of [H] are all-sky upper limits 

on a set of narrow frequency bands within the range 200-400 Hz and including one spin-down 
parameter. The best upper limit on the gravitational wave strain amplitude that we obtained 
in this frequency range is 4.43 x 10 -23 . Several searches have been completed or are underway 
using the same data. Examples of such searches are |H] which targets known radio pulsars at 
twice the pulsar frequency, and [5] that performs an all sky search in a wide frequency band, but 
using only about 10 hours of data and applying coherent (matched filtering) techniques. Such 
techniques are very computationally intensive for large parameter space searches. 

The ultimate goal for wide parameter space searches for continuous signals over large data 
sets is to employ hierarchical schemes |101 111! 1121 1131 1141 ITS] which alternate coherent and semi- 
coherent techniques. The Hough transform search would then be used to select candidates in 
parameter space to be followed up. It is crucial that the first stage of the search is as efficient 
as possible, since it determines the final sensitivity of the full pipeline. 

In this paper we present an improved implementation of the Hough transform with respect to 
what was described in jlj when this is applied to a set of Fourier transformed data (without any 



demodulations). This new implementation takes into account non-stationarities of the noise floor 
and also the amplitude modulation due to the antenna pattern that were previously ignored. 

The plan of the paper is as following: Section |2] described the waveforms we are looking for. 
Section El describes the basics of the standard Hough transform and in Section 0] how this could 
be improved in the presence of non-stationarities. 



2. The signal at the detector 

The output of the detector can be represented by 

x(t) = h(t) + n{t) , (1) 

where n(t) is the noise that affects the observation at a detector time t, and h(t) is the 
gravitational wave signal received at the detector: 

h(t) = F+(n, i>)h+{t) + F x (n, j)h x (t) . (2) 

Here h + and h x are the two independent polarizations of the strain amplitude at the detector 
and F+ and F x are the antenna beam pattern functions, that depend on the direction n to the 
star and also on the polarization angle ip. Due to the motion of the Earth, the F +tX depend 
implicitly on time. In case of a laser interferometer detector with perpendicular arms, the 
expressions for F + and F x are given by \i£ 



F + = a{t) cos 2%() + b(t) sin 2ip , F x = b(t) cos 2ip - a(t) sin 2ip , (3) 

where the functions a(t) and b(t) are independent of tp. 

We shall make the standard assumption that over a certain time-scale, the noise n(t) is 
stationary, although in realistic cases this hypothesis is likely to be violated at some level. 
Within this approximation, the Fourier components n(f) of the noise are statistically described 
by: 

E[h(f)h*(f)] = ±6(f-f')S n (f), (4) 

where E\\ denotes the expectation value with respect to an ensemble of noise realization, the * 
superscript denotes complex conjugate, S n (f) is the one sided noise power spectral density, and 
tildes denote Fourier transforms. 

The form of the gravitational wave emitted by an isolated pulsar depends on the physical 
mechanism which may cause the neutron star to emit periodic gravitational waves. The main 
possibilities considered in the literature are (i) non-axisymmetric distortions of the solid part of 
the star, (ii) unstable r-modes in the fluid, and (iii) free precession (or 'wobble') (see ^Z] and 
references therein) . If we assume the emission mechanism is (i) , then the gravitational waves are 
emitted at a frequency which is twice the rotational rate f r of the pulsar. Under this assumption, 
the waveforms for the two polarizations /i+, x are given by |16j : 

1 ~t~ COS^ L 

h + = A + cos$(t) = h cos<3?(t), h x = A x sin <3?(i) = h cos i sin $(t) , (5) 

where l is the angle between the pulsar's spin axis and the direction of propagation of the waves, 
and ho is the amplitude: 

h Q = ^ —. (6) 

Here G is Newton's gravitational constant, c the speed of light, I zz is the z-z component of the 
star's moment of inertia with the z-axis being its spin axis, e := {I xx — I yy )/I Z z is the equatorial 
ellipticity of the star, and d is the distance of the star from Earth. 



The phase &(t) takes its simplest form in the Solar System Barycenter (SSB) frame where it 
can be expanded in a Taylor series: 



*(t) = $ + 2vr J] 7-^Yv(T - T r +1 . (7) 

n>0 ^ 

Here T is time in the SSB frame and To is a fiducial start time. The phase $0, frequency /( ) 
and the spin-down parameters fr n \, n > 0, are defined at this fiducial start time. Neglecting 
relativistic effects which do not affect us significantly in this case, the relation between the time 
of arrival T of the wave in the SSB frame and in the detector frame t is 

m n • r 

T = t+ , 8 

c 

where n is the direction to the pulsar and r is the detector position in the SSB frame. The 
instantaneous frequency f(t) of the wave as observed by the detector is given, to a very good 
approximation, by the familiar non-relativistic Doppler formula: 

f(t)-f(t) = f(t)^^ (9) 

where v(t) is the velocity of the detector at the detector time t, f(t) is the instantaneous signal 
frequency at time t: 

/W = /(o) + E^r( T - T o) n - ( 10 ) 

n>0 

Equations Q and (flO|) describe the time-frequency pattern produced by a signal, and this is 
the pattern that the Hough transform is used to look for. 



3. Standard Hough transform 

The Hough transform can be used to find the pattern produced by the Doppler shift © and 
the spin-down of a gravitational wave signal in the time-frequency plane of our data. The 
parameters which determine this pattern are ({/( n )},n). The parameter space is covered by a 
discrete cubic grid and the result of the Hough transform is an histogram for each point of this 
grid. By "standard" implementation of the Hough transform, we refer to the implementation 
used in [H] where the starting point are TV" short segments of Fourier transformed data (which are 
called Short Fourier Transforms, or SFTs) with T coh being the time baseline of the SFTs. From 
this set of SFTs, calculating the periodograms (the square modulus of the Fourier transform) 
and selecting frequency bins (peaks) above a certain threshold p th , we obtain a time- frequency 
map of our data. Assuming a fixed threshold, the optimal value turns out to be p th = 1.6, 
corresponding to a peak selection probability of q = e~ Pth = 0.2 in the absence of a signal [I]. 
For each selected bin in the SFTs, we find which points in parameter space are consistent with 
it, according to Eq. Q, and the number count in all such points is increased by unity. This is 
repeated for all the selected bins in all the SFTs to obtain the final histogram. 

There are several criteria to select the frequency bins. One of them is to select by setting a 
threshold p th on the normalized power p^ defined as 

2\x k \ 2 

Pk - If, q / f \ , U-LJ 

J coh<JnUfcj 

where is the discrete Fourier transform of the data, the frequency index k corresponds to a 
physical frequency of fk = k/T coh , and S n (fk) is the single sided power spectral density of the 



detector noise. The k frequency bin is selected if pk > p t h, and rejected otherwise. In this 
way, each SFT is replaced by a collection of zeros and ones called a peak-gram. The Hough 
transform is used to calculate the number count n at each parameter space point starting from 
this collection of peak-grams, i.e. each point in parameter space corresponds to a pattern in the 
time-frequency plane of our data, and the number count n is the sum of the ones and zeros of 
the different peak- grams along this curve. 

Let p{n) be the probability distribution of n in the absence of a signal, and p(n\h) the 
distribution in the presence of a signal h(t). It is clear that < n < N, where N is the number 
of SFTs, and it can be shown that for stationary Gaussian noise, p{n) is a binomial distribution 
with mean Nq where q = e~ Pth is the probability that any frequency bin is selected: 

V{n)= ( ^ V(l-g) 

In the presence of a signal, the distribution p{n\h) is ideally also a binomial but with a slightly 
larger mean Nij where, for weak signals, 77 is given by 

r ] = q [l+ P -fX k + 0{\l)) . (13) 

Afc is the signal to noise ratio within a single SFT, and for the case when there is no mismatch 
between the signal and the template: 

>k _ MHfk)\ 2 

^coh Sri (,/fc) 

with h{f) being the Fourier transform of the signal h{t) (see [I] for details) 

Candidates in parameter space are selected by setting a threshold n th on the number count. 
For the case of large N, and Gaussian stationary noise, the relation between the the false alarm 
rate a and the threshold n th is independent of the signal strength and given by: 

n th = Nq+ ^2Nq{\ - q) erfc" 1 (2a) (15) 

where, erfc -1 is the inverse of the complementary error function. On the average, the weakest 
signal which will cross the above thresholds at a false alarm rate a and false dismissal (3 is given 
by 

ho = 5.34^^1^, 5 = erfc" 1 (2a)+erfc- 1 (2/3). (16) 
Equation (|16|) gives the smallest signal which can be detected by the search. 

4. Improved peak selection for the Hough transform 

The approximation that the distribution p(n\h) in the presence of a signal is binomial can break 
down for several reasons: the random mismatch between the signal and the template used to 
calculate the number count, non-stationarity of the noise, and the amplitude modulation of the 
signal which causes Afc to vary from one SFT to another and for different sky locations and 
pulsar orientations. The result of these effects is to "smear" out the binomial distribution in the 
presence of a signal as it is illustrated in Figure 14 in [2] . To account for these non-stationarities 
a possible Adaptive Hough Transform is discussed in |18j in which Palomba et al suggest not to 
increase the number-count by unity, but by a real number depending on the detector response 
function and the noise level. 

In this paper we propose an alternative method that consists in setting an adaptive peak 
selection threshold that varies for the different SFTs and sky locations, and then performing 



(12) 



(14) 



the Standard Hough Transform on the resulting peak-grams. We expect this method should 
have a somewhat similar gain in sensitivity as the method presented in ^SJ; but with a better 
performance since all the implementation tricks described in [3] and [H] Sec VI. C can still be 
applied. However, we emphasize that a proper statistical justification of this method is still 
unclear and is being investigated. 



4-1. Dealing with non-stationarity 

Let us consider the quantity p k X k /2, related to the second summand in right hand side of 
Eq. I)13|l . which, in the discrete case, can be rewritten as: 

PkXk \x k \ 2 \h(f k )\ 2 \x k \ 2 {E[\h k \ 2 ]) N \h(f k )\ 2 
2 E[\h k \2] E[\n k \ 2 } E[\h k \2] E[\n k \*} (E[\n k \ 2 ]) N ' 1 ' 

Here £7[|nfc| ] = T coh S n (f k )/2 denotes the noise contribution in a single SFT, which is assumed 
to be stationary in a time scale T coh although it can vary from one SFT to another. In practice 
the expected noise level, in each data stretch, is estimated from the data itself by means of the 
running median ^2] and it is frequency dependent. (£'[|nfc| 2 ]) JV denotes the average noise floor 
of the N different SFTs. 

We typically choose T coh = 30 min. While a larger value of T coh would lead to better sensitivity, 
this time baseline cannot be made arbitrarily large because of the frequency drift caused by the 
Doppler effect and the spin-down; we would like the signal power of a putative signal to be 
concentrated in less than half the frequency resolution l/T coh . It is shown in [I] that at 300 Hz, 
we could ideally choose T coh up to ~ 60 min. On the other hand, we should be able to find a 
significant number of such data stretches during which the interferometers are in lock, the noise is 
stationary, and the data are labeled satisfactory according to certain data quality requirements. 

In order to keep p th \ k /2 constant across the different N data segments, Eq. (|17|) suggests 
that, ignoring the amplitude modulation, a more natural threshold for the peak selection should 
be placed not on p k but on 

\x k \ 2 {E[\h k \ 2 ])j 
E[\h k \ 2 ] E[\h k \ 

The probability of selecting a peak in the absence of a signal in a single SFT would then be: 



Pk = 121 pn^,|21 • ( 18 ) 



q k = exp 



E[\h k \ 

-Ah 



(19) 



thus selecting fewer peaks in the most noisy periods. 

The presence of very noisy periods, if those are not vetoed for the analysis, could in principle 
disturb not just some SFTs but also dominate the average (^[Infcl 2 ])^, and this could degrade 
the sensitivity. Therefore, in practice, (£'[|nfc| 2 ]) JV should be estimated by a more sophisticated 
method, e.g. based on the mean of some sorted points between different quartiles of the different 
SFTs, or the harmonic mean etc. 



4-2. Amplitude modulation corrections 

In Eq. ijnj) the amplitude of the signal \h(f k )\ 2 is proportional to F 2 A+ + F 2 A 2 X , again taking 
a and b to be approximately constant. Averaging over the polarization angle ip 

(••V=^j[ *#(■■■), (20) 

we have = {F 2 )^ = (a 2 + b 2 )/2. Therefore {\h{f k )\ 2 )^ cx a 2 + b 2 and this holds true 

independently of the polarization of the waveform, and also \h(f k )\ 2 oc a 2 + b 2 for circular 
polarization (cos t = 1) independently of ^. 



The functions a and b vary in time due to the motion of the detector, although they can be 
considered constant for the typical time baseline T coh of the SFTs. This suggests that the peak 
selection criteria should be based on setting a threshold on: 

_ \x k \ 2 (E[\n k \ 2 ]) N a 2 + b 2 

Pk E[\h k \ 2 ] E[\h k \ 2 ] (a 2 + b 2 ) N > { ) 

and selecting those frequency bins (peaks) which satisfy p k > p th . The peak selection probability 
in the absence of a signal would be: 

E[\n k \ 2 ] (a 2 + b 2 ) 



q k = exp 



-Pth 



(E[\h k \ 2 ]) N a 2 + b 2 



(22) 



In other words, the threshold on |xfc| 2 /i?[|nfc| 2 ] would be frequency dependent and also different 
for different SFTs and sky locations. 

Since the functions a and b vary for different sky locations and we want to take advantage of 
the implementation method described in [3], for a wide area or an all-sky search, the celestial 
sphere should then be divided into several smaller regions. For each of these small sky-patches, 
the functions a and b should be evaluated at the center of the sky-patch and assumed constant 
for the entire sky-patch. The peak selection procedure needs to be repeated for each individual 
sky-patch we want to analyze. The smaller the patch the more the gain in sensitivity and also 
the smaller the distortions induced by the tiling of the celestial as discussed in |S] . 

The details of the statistical properties of this search will be presented elsewhere. In 
particular, as mentioned earlier, a proper statistical justification of this method in terms of, 
say, a Neyman-Pearson criteria, the regime where the gain in sensitivity is most significant, and 
a large scale Monte-Carlo study are currently being investigated. 
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